Timothy W Russell*, Joel Hellewell, Sam Abbott, Nick Holding, Hamish Gibbs, Christopher I Jarvis, Kevin Van Zandvoort, CMMID COVID-19 working group, Stefan Flasche, Rosalind M Eggo, W John Edmunds, Adam J Kucharski

authors contributed equally

* corresponding author

Last Updated: 2020-04-30

Aim

To estimate the percentage of symptomatic COVID-19 cases reported in different countries, using case fatality ratio (CFR) estimates based on data from the ECDC and correcting for the delay between confirmation and death.

Methods Summary

Current estimates of the percentage of symptomatic cases reported for countries with more than ten deaths

Temporal variation

_Figure 1: Temporal variation in reporting rate. We calculate the percentage of symptomatic cases reported on each day on which a country has had more than ten deaths. We then fit a Generalised Additive Model (GAM) to these data (see the Temporal variation model fitting section for details), highlighting the temporal trend in each country's reporting rate. The shaded region is the 95% CI of the fitted GAM._


Current estimates

Figure 2: Estimates of the proportion of symptomatic cases reported in different countries, based on delay-corrected CFR (cCFR) estimates. Blue shading shows the 2.5% - 97.5% confidence range.

Table of current estimates

| Country | Percentage of symptomatic cases reported (95% CI) | Total cases | Total deaths |
|---|---|---|---|
| Afghanistan | 18% (12% - 44%) | 1949 | 60 |
| Albania | 21% (13% - 50%) | 766 | 31 |
| Algeria | 6.3% (5% - 12%) | 3848 | 444 |
| Andorra | 18% (12% - 39%) | 753 | 42 |
| Argentina | 15% (11% - 29%) | 4272 | 214 |
| Armenia | 50% (30% - 100%) | 1932 | 30 |
| Australia | 84% (59% - 100%) | 6746 | 90 |
| Austria | 29% (23% - 46%) | 15364 | 580 |
| Azerbaijan | 63% (36% - 100%) | 1766 | 23 |
| Bahamas | 6% (3.2% - 20%) | 80 | 11 |
| Bangladesh | 18% (13% - 45%) | 7103 | 163 |
| Belarus | 73% (50% - 100%) | 13181 | 84 |
| Belgium | 5.4% (4.6% - 9.3%) | 47859 | 7501 |
| Bolivia | 11% (7.2% - 26%) | 1110 | 59 |
| Bosnia and Herzegovina | 22% (15% - 47%) | 1588 | 62 |
| Brazil | 7.6% (6.4% - 15%) | 78162 | 5466 |
| Bulgaria | 16% (11% - 36%) | 1447 | 64 |
| Burkina Faso | 14% (9.3% - 31%) | 641 | 43 |
| Cameroon | 20% (13% - 45%) | 1832 | 61 |
| Canada | 12% (10% - 23%) | 51587 | 2996 |
| Chile | 50% (37% - 100%) | 14885 | 216 |
| China | 24% (20% - 32%) | 83944 | 4637 |
| Colombia | 15% (11% - 30%) | 6211 | 278 |
| Côte d'Ivoire | 63% (32% - 100%) | 1238 | 14 |
| Croatia | 30% (20% - 61%) | 2062 | 67 |
| Cuba | 18% (12% - 43%) | 1467 | 58 |
| Cyprus | 40% (23% - 100%) | 843 | 20 |
| Czechia | 33% (25% - 59%) | 7579 | 227 |
| Democratic Republic of the Congo | 12% (7.2% - 30%) | 500 | 31 |
| Denmark | 18% (14% - 32%) | 9008 | 443 |
| Dominican Republic | 16% (13% - 33%) | 6652 | 293 |
| Ecuador | 13% (10% - 26%) | 24675 | 883 |
| Egypt | 8.8% (6.9% - 18%) | 5268 | 380 |
| Estonia | 33% (22% - 69%) | 1666 | 50 |
| Finland | 20% (15% - 38%) | 4906 | 206 |
| France | 5.1% (4.3% - 8.2%) | 128442 | 24087 |
| Germany | 25% (21% - 40%) | 159119 | 6288 |
| Ghana | 62% (33% - 100%) | 1671 | 16 |
| Greece | 19% (14% - 34%) | 2576 | 139 |
| Guatemala | 20% (11% - 67%) | 585 | 16 |
| Guernsey | 19% (9.9% - 56%) | 251 | 13 |
| Honduras | 7.6% (5.3% - 16%) | 771 | 71 |
| Hungary | 6.8% (5.3% - 13%) | 2775 | 312 |
| India | 17% (14% - 35%) | 33050 | 1074 |
| Indonesia | 8.7% (7% - 17%) | 9771 | 784 |
| Iran | 15% (13% - 24%) | 93657 | 5957 |
| Iraq | 19% (13% - 37%) | 2003 | 92 |
| Ireland | 13% (11% - 24%) | 20253 | 1190 |
| Isle of Man | 14% (8.1% - 37%) | 313 | 21 |
| Israel | 68% (51% - 100%) | 15834 | 215 |
| Italy | 7.3% (6.2% - 11%) | 203591 | 27682 |
| Japan | 26% (21% - 51%) | 14088 | 415 |
| Jersey | 12% (7.2% - 32%) | 286 | 21 |
| Kazakhstan | 77% (45% - 100%) | 3205 | 25 |
| Kenya | 19% (10% - 59%) | 384 | 15 |
| Kosovo | 26% (15% - 74%) | 799 | 22 |
| Kuwait | 87% (50% - 100%) | 3740 | 24 |
| Latvia | 54% (28% - 100%) | 849 | 15 |
| Lebanon | 32% (19% - 74%) | 721 | 24 |
| Liberia | 5.5% (3.2% - 18%) | 141 | 16 |
| Lithuania | 31% (20% - 67%) | 1375 | 45 |
| Luxembourg | 45% (31% - 84%) | 3769 | 89 |
| Malaysia | 59% (42% - 100%) | 5945 | 100 |
| Mali | 9.7% (5.9% - 29%) | 482 | 25 |
| Mexico | 5.4% (4.4% - 11%) | 17799 | 1732 |
| Moldova | 23% (17% - 51%) | 3771 | 111 |
| Morocco | 18% (13% - 38%) | 4321 | 168 |
| Netherlands | 7.4% (6.1% - 12%) | 38802 | 4711 |
| New Zealand | 65% (36% - 100%) | 1129 | 19 |
| Niger | 20% (12% - 48%) | 713 | 32 |
| Nigeria | 15% (9.9% - 40%) | 1728 | 51 |
| North Macedonia | 17% (11% - 36%) | 1442 | 73 |
| Norway | 40% (30% - 69%) | 7667 | 202 |
| Pakistan | 27% (21% - 56%) | 15759 | 346 |
| Panama | 27% (20% - 54%) | 6378 | 178 |
| Peru | 18% (14% - 38%) | 33931 | 943 |
| Philippines | 12% (9.4% - 22%) | 8212 | 558 |
| Poland | 15% (12% - 29%) | 12640 | 624 |
| Portugal | 22% (18% - 39%) | 24505 | 973 |
| Puerto Rico | 14% (9.6% - 29%) | 1433 | 86 |
| Romania | 14% (11% - 25%) | 11978 | 675 |
| Russia | 48% (39% - 100%) | 99399 | 972 |
| San Marino | 12% (8% - 27%) | 563 | 41 |
| Saudi Arabia | 68% (50% - 100%) | 21402 | 157 |
| Serbia | 36% (27% - 76%) | 8724 | 173 |
| Singapore | 100% (100% - 100%) | 15641 | 14 |
| Sint Maarten | 5.1% (2.9% - 16%) | 76 | 13 |
| Slovakia | 54% (31% - 100%) | 1391 | 22 |
| Slovenia | 17% (12% - 31%) | 1418 | 89 |
| Somalia | 7.7% (4.8% - 26%) | 582 | 28 |
| South Africa | 35% (25% - 74%) | 5350 | 103 |
| South Korea | 55% (42% - 83%) | 10765 | 247 |
| Spain | 8.5% (7.2% - 14%) | 212917 | 24275 |
| Sudan | 3.9% (2.5% - 13%) | 375 | 28 |
| Sweden | 6.3% (5.2% - 11%) | 20302 | 2462 |
| Switzerland | 22% (18% - 35%) | 29324 | 1407 |
| Thailand | 58% (38% - 100%) | 2954 | 54 |
| Tunisia | 23% (15% - 51%) | 980 | 40 |
| Turkey | 28% (24% - 53%) | 117589 | 3081 |
| Ukraine | 24% (18% - 53%) | 9866 | 250 |
| United Arab Emirates | 74% (52% - 100%) | 11929 | 98 |
| United Kingdom | 4.8% (4% - 8.4%) | 165221 | 26097 |
| United Republic of Tanzania | 12% (6.4% - 42%) | 480 | 16 |
| United States of America | 13% (11% - 23%) | 1039909 | 60966 |
| Uruguay | 40% (21% - 100%) | 630 | 15 |

Table 1: Estimates of the proportion of symptomatic cases reported in different countries using cCFR estimates based on case and death time-series data from the ECDC. Total cases and deaths in each country are also shown. Confidence intervals were calculated using an exact binomial test at the 95% confidence level.
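The exact binomial interval mentioned in the caption of Table 1 can be illustrated with a short sketch. This is a generic Clopper-Pearson interval written with the Python standard library, not the authors' code; the function names are hypothetical, and which counts are passed as `k` and `n` (for example, deaths and cases with known outcomes) is an assumption that the text does not spell out.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log(1.0 - p))
        total += math.exp(log_pmf)
    return min(1.0, total)

def solve_decreasing(func, target, iters=60):
    """Bisection on [0, 1] for a function that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_binomial_ci(k, n, level=0.95):
    """Clopper-Pearson interval for a proportion with k events out of n trials."""
    alpha = 1.0 - level
    # Lower bound: the p at which P(X >= k) = alpha / 2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0)
    # Upper bound: the p at which P(X <= k) = alpha / 2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2.0)
    return lower, upper

# Hypothetical counts, for illustration only.
low, high = exact_binomial_ci(k=5, n=10)
```

Because the interval is exact rather than based on a normal approximation, it stays inside [0, 1] and remains usable for countries with very few deaths.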

Adjusting for outcome delay in CFR estimates

During an outbreak, the naive CFR (nCFR), i.e. the ratio of reported deaths to date to reported cases to date, will underestimate the true CFR because the outcome (recovery or death) is not known for all cases [5]. We can therefore estimate the true denominator for the CFR (i.e. the number of cases with known outcomes) by accounting for the delay from confirmation to death [1].

We assumed the delay from confirmation to death followed the same distribution as the estimated delay from hospitalisation to death, based on data from the COVID-19 outbreak in Wuhan, China, between 17 December 2019 and 22 January 2020, accounting for right-censoring in the data as a result of as-yet-unknown disease outcomes (Figure 1, panels A and B in [7]). The distribution used is a lognormal fit with a mean delay of 13 days and a standard deviation of 12.7 days [7].
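As a minimal sketch of how the stated mean and standard deviation translate into a usable delay distribution, the snippet below converts them to the log-scale parameters of a lognormal and discretises the result into daily probabilities. The function names, the 60-day cut-off and the binning of the delay into whole days (probability of a delay in the interval from j to j + 1 days) are assumptions made for illustration, not details taken from the analysis.

```python
import math

def lognormal_params(mean, sd):
    """Convert a lognormal's mean and standard deviation to (mu, sigma) on the log scale."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a lognormal; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def daily_delay_pmf(mu, sigma, max_days):
    """Probability that the delay falls in [j, j + 1) days, for j = 0..max_days - 1."""
    return [lognormal_cdf(j + 1, mu, sigma) - lognormal_cdf(j, mu, sigma)
            for j in range(max_days)]

# Mean 13 days and standard deviation 12.7 days, as quoted from [7].
mu, sigma = lognormal_params(mean=13.0, sd=12.7)   # roughly mu = 2.23, sigma = 0.82
f = daily_delay_pmf(mu, sigma, max_days=60)        # 60-day cut-off is an assumption
```

The list `f` plays the role of the delay proportions used when estimating the number of cases with known outcomes.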

To correct the CFR, we use the case and death incidence data to estimate the proportion of cases with known outcomes [1,6]:

\[ u_{t} = \frac{ \sum_{j = 0}^{t} c_{t-j} f_j}{c_t}, \]

where \(u_t\) is the estimated proportion of cases with known outcomes [1,5,6], which is used to scale the cumulative number of cases in the denominator of the corrected CFR (cCFR) calculation, \(c_{t}\) is the daily case incidence at time \(t\), and \(f_j\) is the proportion of cases with a delay of \(j\) days between confirmation and death.
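The correction can be sketched in a few lines of Python. The function below evaluates the numerator of the expression above, the expected number of cases with known outcomes on each day, and then divides total deaths by the sum of those values. That aggregation step, the function names and the toy numbers are assumptions made for illustration; the analysis code linked in the Code and data availability section remains the reference.

```python
def known_outcomes(cases, f):
    """Expected cases with known outcomes on each day t: sum over j of c_{t-j} * f_j."""
    known = []
    for t in range(len(cases)):
        total = 0.0
        # Delays longer than t days, or beyond the end of f, contribute nothing.
        for j in range(min(t, len(f) - 1) + 1):
            total += cases[t - j] * f[j]
        known.append(total)
    return known

def naive_cfr(cases, deaths):
    """Deaths to date divided by cases to date."""
    return sum(deaths) / sum(cases)

def corrected_cfr(cases, deaths, f):
    """Deaths to date divided by the estimated number of cases with known outcomes."""
    total_known = sum(known_outcomes(cases, f))
    if total_known == 0:
        return None  # no outcomes are expected to be known yet
    return sum(deaths) / total_known

# Toy data (hypothetical): daily cases, daily deaths and a three-day delay distribution.
cases = [10, 10, 10]
deaths = [0, 0, 3]
f = [0.0, 0.5, 0.5]

n_cfr = naive_cfr(cases, deaths)          # 3 / 30
c_cfr = corrected_cfr(cases, deaths, f)   # 3 / 15
```

In the toy data only half of the reported cases are expected to have a known outcome, so the corrected estimate is twice the naive one.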

Approximating the proportion of symptomatic cases reported

At this stage, raw estimates of the CFR of COVID-19 that correct for the delay to outcome, but not for under-reporting, have been calculated. These estimates range between 1% and 1.5% [1–3]. We assume a CFR of 1.4% (95% CrI: 1.2-1.7%), taken from a recent large study [3], as a baseline CFR. We use it to approximate the potential level of under-reporting in each country. Specifically, we calculate \(\frac{1.4\%}{\text{cCFR}}\) for each country to estimate the approximate fraction of symptomatic cases reported.
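A minimal sketch of this ratio follows. The function name is hypothetical, and capping the result at 100% is an assumption inferred from the upper bounds shown in Table 1 rather than a step described in the text.

```python
def reporting_fraction(ccfr, baseline_cfr=0.014):
    """Approximate fraction of symptomatic cases reported: baseline CFR / cCFR, capped at 1."""
    if ccfr <= 0:
        raise ValueError("cCFR must be positive")
    return min(1.0, baseline_cfr / ccfr)

# Hypothetical cCFR values, for illustration only.
high_cfr_country = reporting_fraction(0.14)    # cCFR ten times the baseline
low_cfr_country = reporting_fraction(0.007)    # cCFR below the baseline, capped at 1.0
```

A delay-corrected CFR ten times the baseline therefore corresponds to roughly one in ten symptomatic cases being reported.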

Temporal variation model fitting

We estimate the level of under-reporting on every day for each country that has had more than ten deaths. We then fit a Generalised Additive Model (GAM) of the form \[ \log(\mathbb{E}[D]) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p, \] specifying a Poisson distribution on deaths (\(D\)) as the response variable. The model has a log-link function and a log-offset (\(\log(\kappa)\)) consisting of the daily known outcomes (\(u_t c_t\)) and the cCFR estimate for that country on that day, \(\text{cfr}_t\). The model can then be written as \[ D \sim s(t) + \underbrace{\log(u_t c_t) + \log(\text{cfr}_t)}_{:= \log(\kappa)}, \] where \(s(t)\) is a smoothing spline fitted through the time points (days) for which we have data.

Limitations

Assuming that the under-reporting is \(\frac{1.4\%}{\text{cCFR}}\) for a given country implies that any deviation from the assumed 1.4% CFR is entirely due to under-reporting. In reality, the burden on the healthcare system is likely to contribute to CFR estimates higher than 1.4%, along with many other country-specific factors.

The following is a list of the other prominent assumptions made in our analysis:

Code and data availability

The code is publicly available at https://github.com/thimotei/CFR_calculation. The data required for this analysis are time series of both cases and deaths, along with the corresponding delay distribution. We scrape these data from the ECDC using the NCoVUtils package [8].

References

1 Russell TW, Hellewell J, Jarvis CI et al. Estimating the infection and case fatality ratio for COVID-19 using age-adjusted data from the outbreak on the Diamond Princess cruise ship. medRxiv 2020.

2 Verity R, Okell LC, Dorigatti I et al. Estimates of the severity of COVID-19 disease. medRxiv 2020.

3 Guan W-j, Ni Z-y, Hu Y et al. Clinical characteristics of coronavirus disease 2019 in China. New England Journal of Medicine 2020.

4 Shim E, Mizumoto K, Choi W et al. Estimating the risk of COVID-19 death during the course of the outbreak in Korea, February-March, 2020. medRxiv 2020.

5 Kucharski AJ, Edmunds WJ. Case fatality rate for Ebola virus disease in West Africa. The Lancet 2014;384:1260.

6 Nishiura H, Klinkenberg D, Roberts M et al. Early epidemiological assessment of the virulence of emerging infectious diseases: A case study of an influenza pandemic. PLoS One 2009;4.

7 Linton NM, Kobayashi T, Yang Y et al. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data. Journal of Clinical Medicine 2020;9:538.

8 Abbott S MJ Hellewell J. NCoVUtils: Utility functions for the 2019-nCoV outbreak. doi:10.5281/zenodo.3635417 2020.